CS 294-1: Assignment 1 Naive Bayes Classification with Improvements

Author

  • Shaunak Chatterjee
Abstract

The main objective of this assignment was to implement a Naive Bayes classifier and attempt several improvements over the vanilla version. A major challenge was implementing the classifier in Scala using the two libraries scalala and scalanlp. This report presents details of the experiments I tried: varying the smoothing parameter, feature selection, n-gram models, and a mixture of models.
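
As a rough illustration of what the smoothing parameter mentioned in the abstract controls, here is a minimal sketch of a multinomial Naive Bayes classifier with add-alpha smoothing. It is written in Python rather than the Scala used in the report, and the function names (`train_naive_bayes`, `classify`), the model layout, and the toy reviews are all assumptions made for illustration; it is not the author's implementation.

```python
import math
from collections import Counter, defaultdict


def train_naive_bayes(docs, alpha=1.0):
    """Fit a multinomial Naive Bayes model.

    docs  : list of (tokens, label) pairs (hypothetical input format)
    alpha : add-alpha smoothing parameter; must be > 0 so no log(0) occurs
    """
    class_counts = Counter()            # number of documents per class
    word_counts = defaultdict(Counter)  # per-class word frequencies
    vocab = set()
    for tokens, label in docs:
        class_counts[label] += 1
        word_counts[label].update(tokens)
        vocab.update(tokens)

    total_docs = sum(class_counts.values())
    model = {"vocab": vocab, "log_prior": {}, "log_likelihood": {}}
    for label, n_docs in class_counts.items():
        model["log_prior"][label] = math.log(n_docs / total_docs)
        # Smoothed estimate: (count + alpha) / (class total + alpha * |V|)
        denom = sum(word_counts[label].values()) + alpha * len(vocab)
        model["log_likelihood"][label] = {
            w: math.log((word_counts[label][w] + alpha) / denom) for w in vocab
        }
    return model


def classify(model, tokens):
    """Return the class with the highest log-posterior score.

    Words never seen in training are skipped (one simple choice among several).
    """
    scores = {}
    for label, log_prior in model["log_prior"].items():
        ll = model["log_likelihood"][label]
        scores[label] = log_prior + sum(ll[w] for w in tokens if w in model["vocab"])
    return max(scores, key=scores.get)
```

Larger values of `alpha` pull the per-class word probabilities toward uniform, while values near zero trust the raw counts; varying it is the kind of experiment the abstract describes.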

Similar resources

CS 294-1 Assignment 1 Report

Text classification has a growing range of potential applications in many areas of the information world, such as recommender systems and customer service. The goal of this assignment is to apply a Naive Bayes classifier to a data set of labeled textual movie reviews and to practice Scala/ScalaNLP. The data set “Polarity dataset v2.0” is from http://www.cs.cornell.edu/People/pabo/movie-reviewdata/, created by...

A New Approach for Text Documents Classification with Invasive Weed Optimization and Naive Bayes Classifier

With the rapid growth in the number of documents, Text Document Classification (TDC) methods have become crucial. This paper presents a hybrid model of Invasive Weed Optimization (IWO) and the Naive Bayes (NB) classifier (IWO-NB) for Feature Selection (FS), in order to reduce the large size of the feature space in TDC. TDC includes different actions such as text processing, feature extraction, form...

In silico prediction of anticancer peptides by TRAINER tool

Cancer is one of the causes of death in the world. Several treatment methods exist against cancer cells, such as radiotherapy and chemotherapy. Since traditional methods have side effects on normal cells and are expensive, identifying and developing new methods for cancer therapy is very important. Antimicrobial peptides, present in a wide variety of organisms, such as plants, amphibians and ...

CS 294-1 A1: Naive Bayesian Classifier

Settings. Our code was written in Scala and built with Simple Build Tool (SBT). The programs were run on Mac OS. We test the effectiveness of our implementation in various aspects. Unless stated otherwise, we adopt the following default settings. We report macro-averaged F1 measures, further averaged over ten-fold cross-validation. We consider both “Bernoulli” and “Multinomia...

Weighted Naive Bayes Classifier: A Predictive Model for Breast Cancer Detection

This paper investigates the performance of a machine learning tool, the Naive Bayes classifier, with a new weighted approach to classifying breast cancer. Naive Bayes is one of the most effective classification algorithms. In many decision-making systems, ranking performance is a more interesting and desirable concept than classification alone. So to extend traditional Naive Baye...

Publication date: 2012